74 research outputs found

    Capturing Atomic Interactions with a Graphical Framework in Computational Protein Design

    A protein's amino acid sequence determines both its chemical and its physical structures, and together these two structures determine its function. Protein designers seek new amino acid sequences with chemical and physical structures capable of performing some function. The vast size of sequence space frustrates efforts to find useful sequences. Protein designers model proteins on computers and search through amino acid sequence space computationally. They represent the three-dimensional structures for the sequences they examine, specifying the location of each atom, and evaluate the stability of these structures. Good structures are tightly packed but are free of collisions. Designers seek a sequence with a stable structure that meets the geometric and chemical requirements to function as desired; they frame their search as an optimization problem. In this dissertation, I present a graphical model of the central optimization problem in protein design, the side-chain-placement problem. This model allows the formulation of a dynamic programming solution, thus connecting side-chain placement with the class of NP-complete problems for which certain instances admit polynomial time solutions. Moreover, the graphical model suggests a natural data structure for storing the energies used in design. With this data structure, I have created an extensible framework for the representation of energies during side-chain-placement optimization and have incorporated this framework into the Rosetta molecular modeling program. I present one extension that incorporates a new degree of structural variability into the optimization process. I present another extension that includes a non-pairwise decomposable energy function, the first of its kind in protein design, laying the groundwork to capture aspects of protein stability that could not previously be incorporated into the optimization of side-chain placement.
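
As a minimal sketch of the dynamic programming idea mentioned in the abstract above, consider the simplest polynomial-time instance: an interaction graph that is a chain, where each residue position interacts only with its immediate neighbor. This is an illustrative assumption, not the dissertation's algorithm (which targets general interaction graphs), and the names `place_side_chains_chain`, `one_body`, and `two_body` are hypothetical.

```python
from itertools import product


def place_side_chains_chain(one_body, two_body):
    """Minimum-energy rotamer assignment on a chain-shaped interaction graph.

    one_body[i][r]    : energy of rotamer r at position i (hypothetical input)
    two_body[i][r][s] : energy between rotamer r at position i and
                        rotamer s at position i + 1 (hypothetical input)
    Returns (minimum total energy, list of chosen rotamer indices).
    """
    best = list(one_body[0])  # best[r]: lowest energy of a prefix ending in r
    back = []                 # back-pointers, one list per position after the first
    for i in range(1, len(one_body)):
        new_best, ptr = [], []
        for s, e_s in enumerate(one_body[i]):
            cands = [best[r] + two_body[i - 1][r][s] for r in range(len(best))]
            r_min = min(range(len(cands)), key=cands.__getitem__)
            new_best.append(cands[r_min] + e_s)
            ptr.append(r_min)
        best = new_best
        back.append(ptr)
    last = min(range(len(best)), key=best.__getitem__)
    assignment = [last]
    for ptr in reversed(back):  # trace the optimal path backwards
        assignment.append(ptr[assignment[-1]])
    assignment.reverse()
    return best[last], assignment


def brute_force(one_body, two_body):
    """Exhaustive reference search, exponential in the number of positions."""
    def energy(a):
        e = sum(one_body[i][r] for i, r in enumerate(a))
        return e + sum(two_body[i][a[i]][a[i + 1]] for i in range(len(a) - 1))
    best = min(product(*(range(len(p)) for p in one_body)), key=energy)
    return energy(best), list(best)
```

The chain version runs in time proportional to the number of positions times the square of the rotamers per position, whereas the exhaustive search grows exponentially; comparing the two on small inputs is a quick sanity check.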

    Computational protein design with explicit consideration of surface hydrophobic patches

    De novo protein design requires the identification of amino-acid sequences that favor the target folded conformation and are soluble in water. One strategy for promoting solubility is to disallow hydrophobic residues on the protein surface during design. However, naturally occurring proteins often have hydrophobic amino acids on their surface that contribute to protein stability via the partial burial of hydrophobic surface area or play a key role in the formation of protein-protein interactions. A less restrictive approach for surface design that is used by the modeling program Rosetta is to parameterize the energy function so that the number of hydrophobic amino acids designed on the protein surface is similar to what is observed in naturally occurring monomeric proteins. Previous studies with Rosetta have shown that this limits surface hydrophobics to the naturally occurring frequency (~28%) but that it does not prevent the formation of hydrophobic patches that are considerably larger than those observed in naturally occurring proteins. Here, we describe a new score term that explicitly detects and penalizes the formation of hydrophobic patches during computational protein design. With the new term we are able to design protein surfaces that include hydrophobic amino acids at naturally occurring frequencies, but do not have large hydrophobic patches. By adjusting the strength of the new score term the emphasis of surface redesigns can be switched between maintaining solubility and maximizing folding free energy

    De Novo Enzyme Design Using Rosetta3

    The Rosetta de novo enzyme design protocol has been used to design enzyme catalysts for a variety of chemical reactions, and in principle can be applied to any arbitrary chemical reaction of interest. The process has four stages: 1) choice of a catalytic mechanism and corresponding minimal model active site, 2) identification of sites in a set of scaffold proteins where this minimal active site can be realized, 3) optimization of the identities of the surrounding residues for stabilizing interactions with the transition state and primary catalytic residues, and 4) evaluation and ranking of the resulting designed sequences. Stages two through four of this process can be carried out with the Rosetta package, while stage one needs to be done externally. Here, we demonstrate in detail how to carry out the Rosetta enzyme design protocol from start to end, using the triosephosphate isomerase reaction as an illustration.

    Role of conformational sampling in computing mutation-induced changes in protein structure and stability

    The prediction of changes in protein stability and structure resulting from single amino acid substitutions is both a fundamental test of macromolecular modeling methodology and an important current problem as high throughput sequencing reveals sequence polymorphisms at an increasing rate. In principle, given the structure of a wild-type protein and a point mutation whose effects are to be predicted, an accurate method should recapitulate both the structural changes and the change in the folding free energy. Here, we explore the performance of protocols which sample an increasing diversity of conformations. We find that surprisingly similar performances in predicting changes in stability are achieved using protocols that involve very different amounts of conformational sampling, provided that the resolution of the force field is matched to the resolution of the sampling method. Methods involving backbone sampling can in some cases closely recapitulate the structural changes accompanying mutations but, not surprisingly, tend to do more harm than good in cases where structural changes are negligible. Analysis of the outliers in the stability change calculations suggests areas needing particular improvement; these include the balance between desolvation and the formation of favorable buried polar interactions, and unfolded state modeling.

    Modeling Symmetric Macromolecular Structures in Rosetta3

    Symmetric protein assemblies play important roles in many biochemical processes. However, the large size of such systems is challenging for traditional structure modeling methods. This paper describes the implementation of a general framework for modeling arbitrary symmetric systems in Rosetta3. We describe the various types of symmetries relevant to the study of protein structure that may be modeled using Rosetta's symmetric framework. We then describe how this symmetric framework is efficiently implemented within Rosetta, which restricts the conformational search space by sampling only symmetric degrees of freedom, and explicitly simulates only a subset of the interacting monomers. Finally, we describe structure prediction and design applications that utilize the Rosetta3 symmetric modeling capabilities, and provide a guide to running simulations on symmetric systems

    SwiftLib: rapid degenerate-codon-library optimization through dynamic programming

    Degenerate codon (DC) libraries efficiently address the experimental library-size limitations of directed evolution by focusing diversity toward the positions and toward the amino acids (AAs) that are most likely to generate hits; however, manually constructing DC libraries is challenging, error prone and time consuming. This paper provides a dynamic programming solution to the task of finding the best DCs while keeping the size of the library beneath some given limit, improving on the existing integer-linear programming formulation. It then extends the algorithm to consider multiple DCs at each position, a heretofore unsolved problem, while adhering to a constraint on the number of primers needed to synthesize the library. In the two library-design problems examined here, the use of multiple DCs produces libraries that very nearly cover the set of desired AAs while still staying within the experimental size limits. Surprisingly, the algorithm is able to find near-perfect libraries where the ratio of amino-acid sequences to nucleic-acid sequences approaches 1; it effectively side-steps the degeneracy of the genetic code. Our algorithm is freely available through our web server and solves most design problems in about a second
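
To make the library-size trade-off in the abstract above concrete, here is a rough sketch of a dynamic program that picks one degenerate codon per position to cover as many desired amino acids as possible while the number of DNA sequences stays under a limit. It is not SwiftLib's actual implementation: the scoring (count of desired amino acids covered, stop codons forbidden), the caller-supplied candidate list, and the names `encoded` and `best_library` are all assumptions made for illustration. The codon table is the standard genetic code and the base symbols are the IUPAC nucleotide codes.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def encoded(dc):
    """Return (number of codons, set of amino acids) for a degenerate codon."""
    codons = ["".join(c) for c in product(*(IUPAC[b] for b in dc))]
    return len(codons), {CODON_TABLE[c] for c in codons}


def best_library(desired, candidates, max_size):
    """Choose one degenerate codon per position under a DNA library-size limit.

    desired    : list of sets of wanted amino acids, one set per position
    candidates : degenerate codons to consider at every position (assumption)
    max_size   : upper bound on the product of per-position codon counts
    Returns (covered count, library size, chosen codons), or None if infeasible.
    """
    table = {1: (0, [])}  # library size so far -> (best score, chosen codons)
    for want in desired:
        nxt = {}
        for dc in candidates:
            n, aas = encoded(dc)
            if "*" in aas:  # assumption: never allow a stop codon
                continue
            gain = len(aas & want)
            for size, (score, chosen) in table.items():
                new_size = size * n
                if new_size > max_size:
                    continue
                if new_size not in nxt or score + gain > nxt[new_size][0]:
                    nxt[new_size] = (score + gain, chosen + [dc])
        table = nxt
    if not table:
        return None
    # Prefer the highest score, then the smallest library.
    size = max(table, key=lambda s: (table[s][0], -s))
    return table[size][0], size, table[size][1]
```

For example, `best_library([{"A", "V"}, {"L"}], ["GYT", "GCT", "CTG", "CTN"], 2)` considers `GYT` (which encodes both Ala and Val with two codons) against the single-codon alternatives and rejects `CTN` because it would push the library past the size limit.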

    Foldit Standalone: a video game-derived protein structure manipulation interface using Rosetta

    Summary: Foldit Standalone is an interactive graphical interface to the Rosetta molecular modeling package. In contrast to most command-line or batch interactions with Rosetta, Foldit Standalone is designed to allow easy, real-time, direct manipulation of protein structures, while also giving access to the extensive power of Rosetta computations. Derived from the user interface of the scientific discovery game Foldit (itself based on Rosetta), Foldit Standalone has added more advanced features and removed the competitive game elements. Foldit Standalone was built from the ground up with a custom rendering and event engine, configurable visualizations and interactions driven by Rosetta. Foldit Standalone contains, among other features: electron density and contact map visualizations, multiple sequence alignment tools for template-based modeling, rigid body transformation controls, RosettaScripts support and an embedded Lua interpreter

    Computational Design of a PAK1 Binding Protein

    We describe a computational protocol, called DDMI, for redesigning scaffold proteins to bind to a specified region on a target protein. The DDMI protocol is implemented within the Rosetta molecular modeling program and uses rigid-body docking, sequence design, and gradient-based minimization of backbone and side chain torsion angles to design low energy interfaces between the scaffold and target protein. Iterative rounds of sequence design and conformational optimization were needed to produce models that have calculated binding energies that are similar to binding energies calculated for native complexes. We also show that additional conformation sampling with molecular dynamics can be iterated with sequence design to further lower the computed energy of the designed complexes. To experimentally test the DDMI protocol we redesigned the human hyperplastic discs protein to bind to the kinase domain of p21-activated kinase 1 (PAK1). Six designs were experimentally characterized. Two of the designs aggregated and were not characterized further. Of the remaining four designs, three bound to PAK1 with affinities tighter than 350 μM. The tightest binding design, named Spider Roll, bound with an affinity of 100 μM. NMR-based structure prediction of Spider Roll based on backbone and ¹³Cβ chemical shifts using the program CS-ROSETTA indicated that the architecture of human hyperplastic discs protein is preserved. Mutagenesis studies confirmed that Spider Roll binds the target patch on PAK1. Additionally, Spider Roll binds to full length PAK1 in its activated state, but does not bind PAK1 when it forms an auto-inhibited conformation that blocks the Spider Roll target site. Subsequent NMR characterization of the binding of Spider Roll to PAK1 revealed a comparably small binding 'on-rate' constant (≪ 10⁵ M⁻¹ s⁻¹). The ability to rationally design the site of novel protein-protein interactions is an important step towards creating new proteins that are useful as therapeutics or molecular probes.

    Computationally Designed Bispecific Antibodies using Negative State Repertoires

    A challenge in the structure-based design of specificity is modeling the negative states, i.e., the complexes that you do not want to form. This is a difficult problem because mutations predicted to destabilize the negative state might be accommodated by small conformational rearrangements. To overcome this challenge, we employ an iterative strategy that cycles between sequence design and protein docking in order to build up an ensemble of alternative negative state conformations for use in specificity prediction. We have applied our technique to the design of heterodimeric CH3 interfaces in the Fc region of antibodies. Combining computationally and rationally designed mutations produced unique designs with heterodimer purities greater than 90%. Asymmetric Fc crystallization was able to resolve the interface mutations; the heterodimer structures confirmed that the interfaces formed as designed. With these CH3 mutations, and those made at the heavy-/light-chain interface, we demonstrate one-step synthesis of four fully IgG-bispecific antibodies